perm filename VISION[F79,JMC] blob sn#489997 filedate 1979-12-16 generic text, type T, neo UTF8
	Our object is to detail the capabilities of a program
with human-level performance in manipulating, learning about,
and communicating about situations containing physical
objects. We must be careful not to ask for capabilities beyond
those required for the tasks humans can actually do, because
such capabilities may require information or computing power
not actually available to the computer.  We note that human capability
depends on whether the scene is available for manipulation
and vision; for example, the questions that can be answered when
a scene is visible are more extensive than those that can
be answered from a verbal description.  Moreover, physical
manipulation of a scene uses information that is not
verbalized, and much of it probably cannot be verbalized,
though specialists such as photo-interpreters
or Bertillon system experts can verbalize information
that most people cannot.

	Perhaps the most straightforward problem is answering
questions about a scene from a verbal description.